Mark Adams
Division of Psychiatry
mark.adams@ed.ac.uk
Genetics and Environmental Influences on Behaviour and Mental Health
Common: Affects 1% or more of the population
Complex: Inheritance cannot be explained by a single gene
Why use genetics to study mental health and psychiatric disorders?
Diagram showing the seven “characters” observed by Mendel
Summing the effects of a large number of genetic variants produces a continuous, approximately normally distributed phenotype, a consequence of the Central Limit Theorem.
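A minimal simulation sketch of this idea. The setup is an illustrative assumption, not taken from the slides: 1000 independent loci, allele frequency 0.5, and the same small additive effect for every allele. The function name `simulate_polygenic_scores` is hypothetical.

```python
import random
import statistics


def simulate_polygenic_scores(n_people=1000, n_loci=1000, allele_freq=0.5, seed=1):
    """Return one summed genetic score per person (illustrative assumptions only)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_people):
        # At each locus a person carries 0, 1 or 2 trait-increasing alleles;
        # every allele is assumed to add the same small effect (1 unit).
        score = sum(
            (rng.random() < allele_freq) + (rng.random() < allele_freq)
            for _ in range(n_loci)
        )
        scores.append(score)
    return scores


scores = simulate_polygenic_scores()
mean_score = statistics.fmean(scores)
sd_score = statistics.pstdev(scores)

# For a normal distribution roughly 68% of values lie within one SD of the mean.
within_one_sd = sum(abs(s - mean_score) <= sd_score for s in scores) / len(scores)
print(f"mean={mean_score:.1f} sd={sd_score:.1f} within 1 SD={within_one_sd:.2f}")
```

Although each locus only takes the values 0, 1 or 2, the summed score should behave much like a normally distributed trait, with a share within one standard deviation close to 0.68.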
Heritability: the proportion of similarity in phenotypes that can be attributed to similarity in genotypes.
Model: Phenotype (P) = Genotype (G) + Environment (E)
Variance decomposition \[\mathrm{var}(P) = \mathrm{var}(G) + \mathrm{var}(E)\]
Proportion of variance \[H^2 = \frac{\mathrm{var}(G)}{\mathrm{var}(P)}, \quad e^2 = \frac{\mathrm{var}(E)}{\mathrm{var}(P)}, \quad H^2 + e^2 = 1\]
Plot of child (offspring) height versus the average of their parents’ heights. What is a statistic that can be used to summarise the relationship between these two variables?
\(\beta = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}\)
The beta coefficient (slope) of a simple regression is the covariance between the predictor (\(X\)) and the outcome (\(Y\)) divided by the variance of the predictor (\(X\)).
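As a small worked sketch, the slope can be computed directly from this formula. The heights below are made-up values for illustration only, and the helper names `covariance` and `regression_slope` are hypothetical.

```python
def covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def regression_slope(xs, ys):
    """Slope of the regression of ys on xs: cov(X, Y) / var(X)."""
    # var(X) is the covariance of X with itself.
    return covariance(xs, ys) / covariance(xs, xs)


# Made-up heights in cm, for illustration only.
midparent_height = [165.0, 170.0, 172.5, 175.0, 180.0]
offspring_height = [167.0, 169.0, 173.0, 174.0, 178.0]

beta = regression_slope(midparent_height, offspring_height)
print(round(beta, 2))  # 0.76
```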
\[ P = A + E \]
The phenotype value \(P\) is influenced by an additive genetic effect \(A\) and an environmental effect \(E\).
\[ A = d + s \]
Each individual has two copies of the genome, one inherited from each parent.
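The phenotype value (\(P\)) is the sum of the two inherited genetic values, \(d\) from the dam (mother) and \(s\) from the sire (father), plus an environmental value (\(e\)). A prime (\(d^\prime\), \(s^\prime\)) marks the genetic value that a parent carries but does not transmit to the offspring.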
\(\beta = \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}\)
Therefore, \(\beta = \frac{\mathrm{cov}(\frac{P_d + P_s}{2}, P_o)}{\mathrm{var}(\frac{P_d + P_s}{2})}\)
\[ \mathrm{cov}(\frac{P_d + P_s}{2}, P_o) \]
\[ = \mathrm{cov}(\frac{d + d^\prime + e_d + s + s^\prime + e_s}{2}, d + s + e_o) \]
Expand the terms. Recall that:
\[ \mathrm{cov}(A+X,B+Y) = \mathrm{cov}(A,B) + \mathrm{cov}(A,Y) + \mathrm{cov}(X,B) + \mathrm{cov}(X,Y) \]
Thus the covariance can be expanded into a sum over every pair of terms:
\[ = \mathrm{cov}(\frac{d}{2} + \frac{d^\prime}{2} + \frac{e_d}{2} + \frac{s}{2} + \frac{s^\prime}{2} + \frac{e_s}{2}, d + s + e_o) \]
\[ = \mathrm{cov}(\frac{d}{2}, d) + \mathrm{cov}(\frac{d^\prime}{2}, d) + \dotsb + \mathrm{cov}(\frac{e_s}{2}, e_o) \]
Some terms can be simplified.
Covariance between a genetic effect and itself \[ \mathrm{cov}(\frac{d}{2}, d), \mathrm{cov}(\frac{s}{2}, s) \]
Simplifies to:
\[ \mathrm{cov}(\frac{d}{2}, d) = \frac{1}{2}\mathrm{cov}(d, d) = \frac{1}{2}\mathrm{var}(d) \] \[ \mathrm{cov}(\frac{s}{2}, s) = \frac{1}{2}\mathrm{cov}(s, s) = \frac{1}{2}\mathrm{var}(s) \]
For some terms we might make an assumption that they are equal to 0.
Covariance between genetic effects from the same parent \[ \mathrm{cov}(\frac{d^\prime}{2}, d), \mathrm{cov}(\frac{s^\prime}{2}, s) \]
Covariance between genetic effects from different parents \[ \mathrm{cov}(\frac{d^\prime}{2}, s), \mathrm{cov}(\frac{s^\prime}{2}, d) \]
Covariance between parent and offspring environment effects \[ \mathrm{cov}(\frac{e_d}{2}, e_o), \mathrm{cov}(\frac{e_s}{2}, e_o) \]
Covariance between parental genetic and offspring environmental effects \[ \mathrm{cov}(\frac{d}{2}, e_o), \mathrm{cov}(\frac{s}{2}, e_o) \]
Using those assumptions the parent–offspring covariance simplifies to
\[ \mathrm{cov}(\frac{P_d + P_s}{2}, P_o) = \frac{\mathrm{var}(d) + \mathrm{var}(s)}{2} \]
The denominator in the regression equation was \[ \mathrm{var}(\frac{P_d + P_s}{2}) \]
Using the identity \[ \mathrm{var}(aX + bY) = a^2\mathrm{var}(X) + b^2\mathrm{var}(Y) + 2ab\mathrm{cov}(X, Y) \] the variance of the average parental phenotypes is: \[ \mathrm{var}(\frac{P_d + P_s}{2}) = \mathrm{var}(\frac{1}{2}P_d + \frac{1}{2} P_s) \] \[ = \left(\frac{1}{2}\right)^2\mathrm{var}(P_d) + \left(\frac{1}{2}\right)^2\mathrm{var}(P_s) + 2 \cdot \frac{1}{2} \cdot \frac{1}{2} \mathrm{cov}(P_d, P_s) \] \[ = \frac{1}{4}\mathrm{var}(P_d) + \frac{1}{4}\mathrm{var}(P_s) + \frac{1}{2} \mathrm{cov}(P_d, P_s) \]
If we assume, as above, that the parents' phenotypes do not covary (\(\mathrm{cov}(P_d, P_s) = 0\), that is, no assortative mating), this simplifies to
\[ = \frac{\mathrm{var}(P_d) + \mathrm{var}(P_s)}{4} \]
Thus the regression equation is:
\[ \beta = \frac{\mathrm{cov}(\frac{P_d + P_s}{2}, P_o)}{\mathrm{var}(\frac{P_d + P_s}{2})} \\ = \frac{\frac{\mathrm{var}(d) + \mathrm{var}(s)}{2}}{\frac{\mathrm{var}(P_d) + \mathrm{var}(P_s)}{4}} \\ = 2\frac{\mathrm{var}(d) + \mathrm{var}(s)}{\mathrm{var}(P_d) + \mathrm{var}(P_s)} \]
Previously we defined
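\[ A = d + s \] thus, assuming the maternal and paternal effects are uncorrelated (\(\mathrm{cov}(d, s) = 0\)), \[ \mathrm{var}(A) = \mathrm{var}(d) + \mathrm{var}(s) \] and we assume the variances of the parental phenotypes are equal \[ \mathrm{var}(P_d) = \mathrm{var}(P_s) = \mathrm{var}(P) \]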
Then substitute into the regression equation
\[ \beta = 2\frac{\mathrm{var}(d) + \mathrm{var}(s)}{\mathrm{var}(P_d) + \mathrm{var}(P_s)} \\ = 2 \frac{\mathrm{var}(A)}{\mathrm{var}(P) + \mathrm{var}(P)} \\ = 2 \frac{\mathrm{var}(A)}{2 \mathrm{var}(P)} \\ = \frac{\mathrm{var}(A)}{\mathrm{var}(P)} \\ = h^2 \]
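The result can be checked with a small simulation sketch. It assumes exactly the model above (\(P = A + E\), uncorrelated effects, equal parental variances) plus normally distributed effects, which is an added assumption. The function name `simulate_midparent_regression` and the value \(h^2 = 0.6\) are illustrative.

```python
import random


def covariance(xs, ys):
    """Sample covariance (n - 1 denominator) of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def simulate_midparent_regression(h2=0.6, n_families=20000, seed=1):
    """Simulate P = A + E families; return the slope of offspring on midparent."""
    rng = random.Random(seed)
    var_p = 1.0                          # total phenotypic variance (arbitrary scale)
    sd_half_a = (h2 * var_p / 2) ** 0.5  # d, d', s, s' each have variance var(A)/2
    sd_e = ((1 - h2) * var_p) ** 0.5     # environmental standard deviation
    midparent, offspring = [], []
    for _ in range(n_families):
        d, d_prime = rng.gauss(0, sd_half_a), rng.gauss(0, sd_half_a)
        s, s_prime = rng.gauss(0, sd_half_a), rng.gauss(0, sd_half_a)
        p_dam = d + d_prime + rng.gauss(0, sd_e)
        p_sire = s + s_prime + rng.gauss(0, sd_e)
        p_offspring = d + s + rng.gauss(0, sd_e)  # offspring inherits d and s
        midparent.append((p_dam + p_sire) / 2)
        offspring.append(p_offspring)
    # beta = cov(midparent, offspring) / var(midparent)
    return covariance(midparent, offspring) / covariance(midparent, midparent)


beta = simulate_midparent_regression(h2=0.6)
print(round(beta, 2))
```

With these settings the printed slope should be close to the simulated heritability of 0.6, up to sampling error.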
Parent and offspring phenotypes become more highly correlated as heritability increases.
Mini review: What assumptions have we made when estimating \(h^2\)?
Heritability can also be estimated from resemblance between different types of related pairs. The general equation is:
\[ h^2 = \frac{b}{\mathrm{r}} \]
\(b\) = regression coefficient
\(\mathrm{r}\) = relatedness coefficient (“coefficient of additive variance”)
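A minimal sketch of the general equation. The relatedness values are the standard textbook coefficients; the regression coefficients are made up for illustration, and the names `RELATEDNESS` and `heritability_from_relatives` are hypothetical. The sketch assumes that resemblance between relatives is purely genetic, with no shared environment.

```python
# Coefficients of relatedness for some common pairs (standard textbook values).
RELATEDNESS = {
    "parent-offspring": 0.5,
    "full siblings": 0.5,
    "half siblings": 0.25,
    "first cousins": 0.125,
}


def heritability_from_relatives(b, pair_type):
    """Estimate h^2 = b / r for a given type of relative pair."""
    return b / RELATEDNESS[pair_type]


# Made-up regression coefficients, for illustration only.
h2_parent = heritability_from_relatives(0.30, "parent-offspring")
h2_half_sib = heritability_from_relatives(0.15, "half siblings")
print(round(h2_parent, 2), round(h2_half_sib, 2))  # 0.6 0.6
```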
Correlation of depression scores for different pairs of relatives
\[ \lambda_\mathrm{R} = \frac{P(\mathrm{affected} | \mathrm{relative\ affected})}{P(\mathrm{affected\ in\ population})} = \frac{K_\mathrm{R}}{K} \]
Example:
\[ \mathrm{cov}(Y, Y_\mathrm{R}) = E[YY_\mathrm{R}] - E[Y] E[Y_\mathrm{R}] \]
\[ = K \times K_\mathrm{R} - K^2 \\ = K(K_\mathrm{R} - K) \\ = K^2 (\frac{K_\mathrm{R}}{K} - 1) \\ = K^2 (\lambda_\mathrm{R} - 1) \]
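For a binary (0/1) phenotype with prevalence \(K\), the phenotypic variance is \(V_\mathrm{P} = K(1-K)\), so
\[ h^2 = \frac{\mathrm{cov}_\mathrm{R}}{\mathrm{r}V_\mathrm{P}} \\ = \frac{K^2 (\lambda_\mathrm{R} - 1)}{\mathrm{r}K(1-K)} \\ = \frac{K (\lambda_\mathrm{R} - 1)}{\mathrm{r}(1-K)} \]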
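A short sketch of this calculation. The prevalence and recurrence risk are made-up numbers for illustration only, not estimates for any real disorder, and the function names are hypothetical. The estimate is on the observed (0/1) scale.

```python
def recurrence_risk_ratio(k_relative, k_population):
    """lambda_R = K_R / K."""
    return k_relative / k_population


def h2_observed_scale(k_population, k_relative, relatedness):
    """h^2 = K (lambda_R - 1) / (r (1 - K)) for a binary phenotype."""
    lam = recurrence_risk_ratio(k_relative, k_population)
    return k_population * (lam - 1) / (relatedness * (1 - k_population))


# Made-up numbers: 15% population prevalence, 30% risk in first-degree relatives.
K, K_R, r = 0.15, 0.30, 0.5
lam = recurrence_risk_ratio(K_R, K)
h2 = h2_observed_scale(K, K_R, r)
print(round(lam, 2), round(h2, 2))  # 2.0 0.35
```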
Contrast pairs of relatives that have comparable environmental similarity but different genetic similarity.
Add a shared (\(C\) or “common”) environment to the basic genetic model, to capture similarity between relatives attributable to environmental factors. \(E\) represents the unique, non-shared environment.
\[P = A + C + E\]
\[h^2 = \frac{\mathrm{var}(A)}{\mathrm{var}(P)}, c^2 = \frac{\mathrm{var}(C)}{\mathrm{var}(P)}, e^2 = \frac{\mathrm{var}(E)}{\mathrm{var}(P)}\]
\[h^2 + c^2 + e^2 = 1\]
MZ twins: \(r_\mathrm{MZ} = h^2 + c^2\)
DZ twins: \(r_\mathrm{DZ} = \frac{1}{2}h^2 + c^2\)
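These two expected correlations can be illustrated with a simulation sketch. It assumes normally distributed \(A\), \(C\) and \(E\) components with total variance 1, that MZ twins share all of \(A\) and DZ twins share half of it, and that both twin types share \(C\) completely. The function names and the values \(h^2 = 0.5\), \(c^2 = 0.25\) are illustrative.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_twin_correlation(h2, c2, genetic_sharing, n_pairs=20000, seed=1):
    """Twin correlation under P = A + C + E (genetic_sharing: 1.0 MZ, 0.5 DZ)."""
    rng = random.Random(seed)
    sd_a_shared = (h2 * genetic_sharing) ** 0.5        # part of A common to both twins
    sd_a_unique = (h2 * (1 - genetic_sharing)) ** 0.5  # part of A unique to each twin
    sd_c = c2 ** 0.5
    sd_e = (1 - h2 - c2) ** 0.5
    twin1, twin2 = [], []
    for _ in range(n_pairs):
        a_shared = rng.gauss(0, sd_a_shared)
        c = rng.gauss(0, sd_c)
        twin1.append(a_shared + rng.gauss(0, sd_a_unique) + c + rng.gauss(0, sd_e))
        twin2.append(a_shared + rng.gauss(0, sd_a_unique) + c + rng.gauss(0, sd_e))
    return correlation(twin1, twin2)


r_mz = simulate_twin_correlation(h2=0.5, c2=0.25, genetic_sharing=1.0)
r_dz = simulate_twin_correlation(h2=0.5, c2=0.25, genetic_sharing=0.5)
print(round(r_mz, 2), round(r_dz, 2))
```

The simulated correlations should be close to \(h^2 + c^2 = 0.75\) for MZ pairs and \(\frac{1}{2}h^2 + c^2 = 0.5\) for DZ pairs, up to sampling error.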
Calculate the difference between the MZ and DZ correlations
\[ r_\mathrm{MZ} - r_\mathrm{DZ} = (h^2 + c^2) - (\frac{1}{2}h^2 + c^2) \]
\[ r_\mathrm{MZ} - r_\mathrm{DZ} = h^2 - \frac{1}{2}h^2 + c^2 - c^2 \]
\[ r_\mathrm{MZ} - r_\mathrm{DZ} = \frac{1}{2}h^2 \]
\[ h^2 = 2(r_\mathrm{MZ} - r_\mathrm{DZ}) \]
Substitute \(h^2\) into the MZ equation and solve for the shared environment component (\(c^2\))
\[ r_\mathrm{MZ} = \underbrace{h^2} + c^2 \]
\[ r_\mathrm{MZ} = 2(r_\mathrm{MZ} - r_\mathrm{DZ}) + c^2 \]
\[ r_\mathrm{MZ} - 2(r_\mathrm{MZ} - r_\mathrm{DZ}) = c^2 \]
\[ c^2 = r_\mathrm{MZ} - 2r_\mathrm{MZ} + 2r_\mathrm{DZ} \]
\[ c^2 = 2r_\mathrm{DZ} - r_\mathrm{MZ} \]
Therefore from MZ and DZ twin correlations we can estimate:
\[ h^2 = 2(r_\mathrm{MZ} - r_\mathrm{DZ}) \\ c^2 = 2r_\mathrm{DZ} - r_\mathrm{MZ} \\ e^2 = 1 - h^2 - c^2 \]
Visualisation with \(r_\mathrm{MZ} = 0.75\) and \(r_\mathrm{DZ} = 0.5\).
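The three estimates can be computed directly from the twin correlations. A minimal sketch using the values from the visualisation; the function name `falconer_estimates` is hypothetical.

```python
def falconer_estimates(r_mz, r_dz):
    """Return (h2, c2, e2) from MZ and DZ twin correlations."""
    h2 = 2 * (r_mz - r_dz)   # additive genetic variance proportion
    c2 = 2 * r_dz - r_mz     # shared environment proportion
    e2 = 1 - h2 - c2         # unique environment proportion (equals 1 - r_mz)
    return h2, c2, e2


h2, c2, e2 = falconer_estimates(r_mz=0.75, r_dz=0.5)
print(h2, c2, e2)  # 0.5 0.25 0.25
```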